Wikipedia and Artificial Intelligence

Authors

  • Razvan Bunescu
  • Evgeniy Gabrilovich
Abstract

Starting from the observation that certain communities have incentive mechanisms in place to create large amounts of unstructured content, we propose in this paper an original model which we expect to lead to the large number of annotations required to semantically enrich Web content at large scale. The novelty of our model lies in the combination of two key ingredients: the effort that online communities make to create content, and the capability of machines to detect regular patterns in user annotations and suggest new ones. Provided that the creation of semantic content is made easy enough and incentives are in place, we can assume that these communities will be willing to provide annotations. However, as human resources are clearly limited, we aim to integrate algorithmic support into our model to bootstrap on existing annotations and learn patterns to be used for suggesting new annotations. As the automatically extracted information needs to be validated, our model presents the extracted knowledge to the user in the form of questions, thus allowing for the validation of the information. In this paper, we describe the requirements on our model and its concrete implementation based on Semantic MediaWiki and an information extraction system, and we discuss lessons learned from practical experience with real users. These experiences allow us to conclude that our model is a promising approach towards leveraging semantic annotation.
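The suggest-and-validate loop described above can be sketched in a few lines. This is a minimal, illustrative sketch, not the authors' actual Semantic MediaWiki implementation: the pattern set, function names, and the yes/no callback standing in for the human contributor are all assumptions.

```python
import re

# A "pattern" here is just a regex paired with the property it predicts,
# standing in for patterns learned from existing annotations
# (e.g. [[capital of::France]] in Semantic MediaWiki syntax).
PATTERNS = [
    (re.compile(r"(\w+) is the capital of (\w+)"), "capital of"),
    (re.compile(r"(\w+) was born in (\w+)"), "born in"),
]

def suggest_annotations(text):
    """Apply the learned patterns to raw text, yielding candidate triples."""
    for pattern, prop in PATTERNS:
        for match in pattern.finditer(text):
            yield (match.group(1), prop, match.group(2))

def validate(candidates, answer):
    """Present each candidate as a yes/no question and keep confirmed ones.

    `answer` is a callback standing in for the human user."""
    confirmed = []
    for subject, prop, obj in candidates:
        question = f'Does "{prop}" relate "{subject}" to "{obj}"?'
        if answer(question):
            confirmed.append((subject, prop, obj))
    return confirmed

text = "Paris is the capital of France. Mozart was born in Salzburg."
candidates = list(suggest_annotations(text))
# Auto-accept everything here; the real model would ask the contributor.
annotations = validate(candidates, answer=lambda question: True)
```

The key design point from the abstract is that extraction output is never written directly: every candidate passes through the question-based validation step.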


Related articles

An Experimental Approach for Collecting Snippets Describing the Relations between Wikipedia Articles

In this paper, we deal with a simple scenario: a student, Bob, wants to know why "mathematics" is very important to "physics" or, in a more specific case, why "differential equations" play a prominent role in the study of "fluid dynamics". The scenario can also be stretched: Bob already has enough knowledge about "artificial intelligence" and now wants to learn about "seman...


Extracting Multilingual Dictionaries for the Teaching

This paper describes a method for creating multilingual dictionaries using Wikipedia as a resource. A lucky strike on the road to multilingual information retrieval, the main idea is simple: taking the titles of Wikipedia pages in English and then finding the titles of the corresponding articles in other languages produces a multilingual dictionary across all those languages. While the page content...
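The title-alignment idea can be sketched as follows. The sample interlanguage-link data below is illustrative; a real run would fetch `langlinks` for each page from the MediaWiki API rather than hard-code them.

```python
def build_dictionaries(langlinks):
    """Invert {english_title: {lang: foreign_title}} into per-language
    bilingual dictionaries {lang: {english_title: foreign_title}}."""
    dictionaries = {}
    for en_title, links in langlinks.items():
        for lang, foreign_title in links.items():
            dictionaries.setdefault(lang, {})[en_title] = foreign_title
    return dictionaries

# Hypothetical interlanguage-link data for two English articles.
LANGLINKS = {
    "Artificial intelligence": {"de": "Künstliche Intelligenz",
                                "fr": "Intelligence artificielle"},
    "Differential equation": {"de": "Differentialgleichung",
                              "fr": "Équation différentielle"},
}

dictionaries = build_dictionaries(LANGLINKS)
```

Because every language's entry is keyed by the shared English title, any language pair can then be joined through English, which is what makes the resulting dictionary multilingual rather than merely bilingual.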


Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human-level intelligence, it becomes inevitable to use knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...


Extracting Common Sense Knowledge from Wikipedia

Much of the natural language text found on the web contains various kinds of generic or “common sense” knowledge, and this information has long been recognized by artificial intelligence as an important supplement to more formal approaches to building Semantic Web knowledge bases. Consequently, we are exploring the possibility of automatically identifying “common sense” statements from unrestri...


RACAI's QA System at the Romanian-Romanian Multiple Language Question Answering (QA@CLEF2008) Main Task

This paper describes the participation of the Research Institute for Artificial Intelligence, Romanian Academy (RACAI) in the Multiple Language Question Answering Main Task at the CLEF 2008 competition. We present our Question Answering system, which answers Romanian questions from Romanian Wikipedia documents, focusing on the implementation details. The presentation will also emphasize the fact that...


Structures of Knowledge from Wikipedia Networks

Knowledge is useless without structure. While the classification of knowledge has been an enduring philosophical enterprise, it recently found applications in computer science, notably for artificial intelligence. The availability of large databases allowed for complex ontologies to be built automatically, for example by extracting structured content from Wikipedia. However, this approach is su...



Journal:

Volume   Issue 

Pages  -

Publication date: 2008